Abstract

The estimation of variance-based importance measures (called Sobol' indices) of the input variables of a numerical model can require a large number of model evaluations. This cost becomes unacceptable for large models involving many input variables (typically more than ten). Recently, Sobol and Kucherenko have proposed the Derivative-based Global Sensitivity Measures (DGSM), defined as the integral of the squared derivatives of the model output, showing that they can help to solve the problem of dimensionality in some cases. We provide a general inequality linking DGSM and total Sobol' indices for input variables belonging to the class of Boltzmann probability measures, extending the previous results of Sobol and Kucherenko for uniform and normal measures. The special case of log-concave measures is also described. This link provides a DGSM-based upper bound for the total Sobol' indices. Numerical tests show the performance of the bound and its usefulness in practice.

Keywords: Boltzmann measure; Derivative-based global sensitivity measure; Global sensitivity analysis; Log-concave measure; Poincaré inequality; Sobol' indices

1. Introduction

With the advent of computing technology and numerical methods, computer models are now widely used to make predictions on poorly known physical phenomena, to solve optimization problems or to perform sensitivity studies. These complex models often include hundreds or thousands of uncertain input variables, whose uncertainties can strongly impact the model outputs (De Rocquigny et al. [1], Kleijnen [2], Patelli et al. [3]). In fact, it is well known that, in many cases, only a small number of input variables really act in the model (Saltelli et al. [4]). This number is related to the notion of the effective dimension of a function (Caflisch et al. [5]), which is a useful way to deal with the curse of dimensionality in practical applications.

Global Sensitivity Analysis (GSA) methods (Sobol [6], Saltelli et al. [4]) aim to apportion the model output variability to the input variables and their interactions. GSA is also an objective way to determine the effective dimension by using the model simulations (Kucherenko et al. [7]). A first class of GSA methods is qualitative and is called "screening", as it aims to deal with a large number of input variables (from tens to hundreds). An example of a screening method is the Morris method (Morris [8]). With a few model evaluations, it allows a coarse estimation of the main effects but misses interactions among input variables. The second class of GSA methods comprises the popular quantitative methods, mainly based on the decomposition of the model output variance, which leads to the so-called variance-based methods and Sobol' sensitivity indices. They allow computing the main and total effects (called first order and total Sobol' indices) of each input variable, as well as interaction effects. However, the estimation procedures are more expensive in terms of the number of required model evaluations. Variance-based methods can therefore only be applied to models with a small number of input variables (no more than a few tens).

Recently, Sobol and Kucherenko [9], [10] have proposed a rigorous mathematical formulation of the Morris method by the use of the so-called Derivative-based Global Sensitivity Measures (DGSM). DGSM seem computationally more tractable than variance-based measures and suffer less from the curse of dimensionality. They also showed theoretically an inequality linking DGSM and total Sobol' indices in the case of uniform or Gaussian input variables.

In this paper, we investigate this close relationship between total Sobol' indices and DGSM, by extending this inequality to a large class of Boltzmann probability measures. We also obtain a result for the class of log-concave measures. The paper is organized as follows. Section 2 recalls some useful definitions of Sobol' indices and DGSM. Section 3 establishes an inequality between these indices for a large class of Boltzmann (resp. log-concave) probability measures. Section 4 provides some numerical simulations on two test models, illustrating how the bound can be used in practice. We conclude in Section 5.

2. Global sensitivity indices definition

2.1. Variance-based sensitivity indices

Let $Y = f(X)$ be a model output with $d$ random input variables $X = (X_1, \ldots, X_d)$. If the input variables are independent (assumption A1) and $E(f^2(X)) < +\infty$ (assumption A2), we have the following unique Hoeffding decomposition (Efron and Stein [11]) of $f(X)$:

$$f(X) = f_0 + \sum_{j=1}^{d} f_j(X_j) + \sum_{i<j}^{d} f_{ij}(X_i, X_j) + \ldots + f_{1 \ldots d}(X_1, \ldots, X_d) \tag{2.1}$$

$$\phantom{f(X)} = \sum_{u \subseteq \{1, 2, \ldots, d\}} f_u(X_u), \tag{2.2}$$

where $f_0 = E[f(X)]$ corresponds to the empty subset; $f_j(X_j) = E[f(X) | X_j] - f_0$ and $f_u(X_u) = E[f(X) | X_u] - \sum_{v \subset u} f_v(X_v)$ for any subset $u \subseteq \{1, 2, \ldots, d\}$.

By regrouping all the terms in equation (2.1) that contain the variable $X_j$ ($j = 1, 2, \ldots, d$) in the function called $g(\cdot)$:

$$g(X_j, X_{\sim j}) = \sum_{u \ni j} f_u(X_u),$$

we have the following decomposition:

$$f(X) = f_0 + g(X_j, X_{\sim j}) + h(X_{\sim j}), \tag{2.3}$$

where $X_{\sim j}$ denotes the vector containing all variables except $X_j$, and where we set $h(\cdot) = f(\cdot) - f_0 - g(\cdot)$. Notice that this decomposition is also unique under assumptions A1 and A2. The function $g(\cdot)$ alone suffices to compute the total sensitivity indices. Indeed, it contains all the information about the variable $X_j$.

Definition 2.1. Assume that A1 and A2 hold, and let $\mu(X) = \mu(X_1, \ldots, X_d)$ be the distribution of the input variables. For any non-empty subset $u \subseteq \{1, 2, \ldots, d\}$, first set

$$D = \int f^2(x)\, d\mu(x) - f_0^2, \qquad D_u = \int f_u^2(x_u)\, d\mu(x_u), \qquad D_u^{tot} = \sum_{v \supseteq u} D_v. \tag{2.4}$$

Further, the first order Sobol' sensitivity index (Sobol [12]) of $X_u$ is

$$S_u = \frac{D_u}{D}. \tag{2.5}$$

The total Sobol' sensitivity index of $X_u$ (Homma and Saltelli [13]) is

$$S_{T_u} = \frac{D_u^{tot}}{D}. \tag{2.6}$$

Note that $D$ is nothing more than the variance of $f(X)$.

The following proposition gives another way to compute the total sensitivity indices.

Proposition 2.1. Under assumptions A1 and A2, the total sensitivity index of variable $X_u$ with $u = \{j\}$ ($j = 1, 2, \ldots, d$) is obtained by the following formulas:

$$D_j^{tot} = \int g^2(x_j, x_{\sim j})\, d\mu(x) \tag{2.7}$$

and

$$D_j^{tot} = \frac{1}{2} \int \left[ f(x) - f(x_j', x_{\sim j}) \right]^2 d\mu(x)\, d\mu(x_j'). \tag{2.8}$$

Proof 2.1. The first formula is an obvious consequence of equation (2.4) and is obtained by using the orthogonality constraints among the terms in the decomposition of equation (2.1). Indeed,

$$D_j^{tot} = \sum_{v \ni j} \int f_v^2(x_v)\, d\mu(x_v) = \int \Big[ \sum_{v \ni j} f_v(x_v) \Big]^2 d\mu(x).$$
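As an illustration of how formula (2.8) can be turned into a sampling scheme, the following minimal sketch estimates $D_j^{tot}$ by resampling only the coordinate $x_j$. It is not the estimator used by the authors: the uniform inputs on [0, 1], the additive test function `toy_model`, the sample size and the seed are all assumptions chosen so that the exact values ($D_1^{tot} = 1/12$, total index 0.2) are known.

```python
import random


def total_variance_part(f, d, j, n, rng):
    """Monte Carlo estimate of D_j^tot from the pairwise-difference formula (2.8)."""
    acc = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(d)]  # x sampled from U[0,1]^d (assumption)
        x_prime = list(x)
        x_prime[j] = rng.random()             # resample only coordinate j
        diff = f(x) - f(x_prime)
        acc += diff * diff
    return 0.5 * acc / n


def toy_model(x):
    # Hypothetical additive test function, not taken from the paper.
    return x[0] + 2.0 * x[1]


rng = random.Random(0)
d_tot_1 = total_variance_part(toy_model, d=2, j=0, n=20000, rng=rng)
variance = (1.0 + 4.0) / 12.0  # exact variance of x1 + 2*x2 with U[0,1] inputs
s_t1 = d_tot_1 / variance      # total Sobol' index of x1, exact value 0.2
print(d_tot_1, s_t1)
```

Each sample costs two model evaluations per input, which is why the cost of total indices grows quickly with the dimension.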
2.2. Derivative-based sensitivity indices

The derivative-based global sensitivity method uses the second moment of the model derivatives as an importance measure. This method is motivated by the fact that a high value of the derivative of the model output with respect to some input variable means that a large variation of the model output is expected for a variation of that variable. This method extends the Morris method (Morris [8]). Indeed, it captures any small variation of the model output due to a single input variable.



DGSM were first proposed in Sobol and Gresham [14]. Then, they have been largely studied in Kucherenko et al. [15], Sobol and Kucherenko [9] and Patelli et al. [16]. From now on, we assume that the function $f$ is differentiable. Two kinds of DGSM are defined below:



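Definition 2.2. Assume that A1 holds and that $\partial f(X) / \partial x_j$ is square-integrable (assumption A3). Then, for $j = 1, 2, \ldots, d$, we define the DGSM indices by:

$$\nu_j = E\left[ \left( \frac{\partial f(X)}{\partial x_j} \right)^2 \right] = \int \left( \frac{\partial f(x)}{\partial x_j} \right)^2 d\mu(x). \tag{2.9}$$

Let $w(\cdot)$ be a bounded measurable function. A weighted version of the last indices is:

$$\tau_j = \int \left( \frac{\partial f(x)}{\partial x_j} \right)^2 w(x_j)\, d\mu(x). \tag{2.10}$$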



Remark 2.1. Sobol and Kucherenko [10] showed that, for the specific weighting function $w(x_j) = \frac{1 - 3 x_j + 3 x_j^2}{6}$ and for the class of models that are linear with respect to each input variable, we have $\tau_j = D_j^{tot}$.
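The equality stated in Remark 2.1 can be checked numerically on a small example. The sketch below is only an illustration under explicit assumptions: the weighting function is taken to be $(1 - 3x + 3x^2)/6$, the inputs are uniform on [0, 1], and the model `f(x1, x2) = x1 * x2` is a hypothetical function that is linear in each input. Both quantities are computed with a plain midpoint quadrature rule rather than by sampling.

```python
def midpoints(n):
    """Midpoints of n equal cells of [0, 1] (midpoint quadrature rule)."""
    return [(i + 0.5) / n for i in range(n)]


def f(x1, x2):
    # Hypothetical model, linear with respect to each input taken separately.
    return x1 * x2


def df_dx1(x1, x2):
    # Analytical partial derivative of f with respect to x1.
    return x2


def weight(x):
    # Weighting function assumed for Remark 2.1.
    return (1.0 - 3.0 * x + 3.0 * x * x) / 6.0


n = 40
pts = midpoints(n)

# Weighted DGSM tau_1, equation (2.10), for inputs uniform on [0,1]^2.
tau_1 = sum(df_dx1(a, b) ** 2 * weight(a) for a in pts for b in pts) / n**2

# D_1^tot from equation (2.8): half the mean squared change when x1 is resampled.
d_tot_1 = 0.5 * sum(
    (f(a, b) - f(a2, b)) ** 2 for a in pts for a2 in pts for b in pts
) / n**3

print(tau_1, d_tot_1)  # both are close to the exact value 1/36
```

For this model the exact value of both quantities is 1/36; the small remaining gap comes from the quadrature rule.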



Remark 2.2. Bearing in mind the decomposition in equation (2.3), we can replace the function $f(\cdot)$ by the function $g(\cdot)$ in equations (2.9) and (2.10). In general, $g(\cdot)$ is a $d_1$-dimensional function ($d_1 \le d$), and this can drastically reduce the number of model evaluations for the numerical computation of $\nu$ or $\tau$. Thus, we have:

$$\nu_j = \int \left( \frac{\partial g(x)}{\partial x_j} \right)^2 d\mu(x), \tag{2.11}$$

$$\tau_j = \int \left( \frac{\partial g(x)}{\partial x_j} \right)^2 w(x_j)\, d\mu(x). \tag{2.12}$$
3. Variance-based sensitivity indices vs. derivative-based sensitivity indices

A formal link between total Sobol' indices and DGSM is of interest both to control the Sobol' indices and to use the DGSM in practice for factor prioritization. Indeed, DGSM estimations need far fewer model evaluations than total Sobol' indices estimations (Kucherenko et al. [15]). Sobol and Kucherenko [9] have established an inequality linking these two indices for uniform and Gaussian random variables (upper bound for $S_{T_j}$). In this section, we extend the inequality to any model whose input variables have marginal distributions that are Boltzmann measures on $\mathbb{R}$ (assumption A4). A measure $\delta$ on $\mathbb{R}$ is said to be a Boltzmann measure if it is absolutely continuous with respect to the Lebesgue measure and its density is of the form $d\delta(x) = \rho(x)\,dx = c \exp[-v(x)]\,dx$. Here $v(\cdot)$ is a continuous function and $c$ a normalizing constant. Many classical continuous probability measures used in practice (see de Rocquigny et al. [1] and Saltelli et al. [4]) are Boltzmann measures.

The class of Boltzmann probability measures includes the well-known class of log-concave probability measures. In this case, $v(\cdot)$ is a convex function (assumption A5). In other words, a twice differentiable probability density function $\rho(x)$ is said to be log-concave if, and only if,

$$\frac{d^2}{dx^2} \left[ \log \rho(x) \right] \le 0. \tag{3.13}$$

Note that the density of the uniform probability measure on an interval is not continuous on $\mathbb{R}$. So it cannot be considered in the class of log-concave probability measures, nor in the class of Boltzmann probability measures.

The two following theorems give the formal link between Sobol' indices and derivative-based sensitivity indices.

Theorem 3.1. Under assumptions A1, A2, A3 and A4, we have:

$$D_j^{tot} \le C(\mu_j)\, \nu_j, \tag{3.14}$$

with $C(\mu_j) = 4 C_1^2$ and $C_1 = \sup_{x \in \mathbb{R}} \frac{\min(F_j(x), 1 - F_j(x))}{\rho_j(x)}$ the Cheeger constant, $F_j(\cdot)$ the cumulative distribution function of $X_j$ and $\rho_j(\cdot)$ the density function of $X_j$.

We recall the four assumptions:

• A1: independence between inputs $X_1, X_2, \ldots, X_d$,

• A2: $f \in L^2(\mu)$,

• A3: $\partial f / \partial x_j \in L^2(\mu)$,

• A4: the distribution of $X_j$ is a Boltzmann probability measure.

Proof 3.1. The inequality (3.14) is a one-dimensional Poincaré inequality, obtained by using $D_j^{tot} = \int g^2(x_j, x_{\sim j})\, d\mu(x)$ (equation (2.7)) and $\nu_j = \int \left( \frac{\partial g(x)}{\partial x_j} \right)^2 d\mu(x)$ (equation (2.11)). The constant is obtained in Bobkov [17] and Fougères [18] for the one-dimensional Poincaré inequality. A proof of the d-dimensional Poincaré inequality is given in Bakry et al. [19].

Theorem 3.2. Under assumptions A1, A2, A3 and A5, we have:

$$D_j^{tot} \le \left[ \exp(v(m)) \right]^2 \nu_j, \tag{3.15}$$

with $C_1 = \frac{\exp(v(m))}{2}$ the Cheeger constant and $m$ the median of the measure $\mu_j$ (such that $\mu_j(X_j \le m) = \mu_j(X_j \ge m)$).

We recall the assumption A5: the distribution of $X_j$ is a log-concave probability measure.

Proof 3.2. See Proof 3.1.

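Table 1 shows the Cheeger constant for some probability distributions that are log-concave and are used in the practice of uncertainty and sensitivity analysis. We also give their medians and the functions $v(\cdot)$. We obtain the same results for the normal distribution $\mathcal{N}(\mu, \sigma^2)$ as Sobol and Kucherenko [9], but we prove them in another way (in this case, $v(m) = \log(\sigma)$). For the uniform distribution $\mathcal{U}[a, b]$, Sobol and Kucherenko obtained via direct integral manipulations the inequality $D_j^{tot} \le \frac{(b - a)^2}{\pi^2} \nu_j$. This relation is the classical Poincaré or Wirtinger inequality (Ané et al. [20]).

| Distribution | $v(x)$ | $m$ | $C_1$ |
|---|---|---|---|
| Normal $\mathcal{N}(\mu, \sigma^2)$ | $\frac{(x - \mu)^2}{2 \sigma^2} + \log(\sigma)$ | $\mu$ | $\frac{\sigma}{2}$ |
| Exponential $\mathcal{E}(\lambda)$, $\lambda > 0$ | $\lambda x - \log(\lambda)$ | $\frac{\log 2}{\lambda}$ | $\frac{1}{\lambda}$ |
| Beta $\mathcal{B}(\alpha, \beta)$, $\alpha, \beta \ge 1$ | $\log\left[ x^{1 - \alpha} (1 - x)^{1 - \beta} \right]$ | No expression | |
| Gamma $\Gamma(\alpha, \beta)$, shape $\alpha \ge 1$, scale $\beta > 0$ | $\log\left( x^{1 - \alpha} \Gamma(\alpha) \right) + \frac{x}{\beta} + \alpha \log \beta$ | No expression | |
| Gumbel $\mathcal{G}(\mu, \beta)$, scale $\beta > 0$ | $\frac{x - \mu}{\beta} + \log \beta + \exp\left( -\frac{x - \mu}{\beta} \right)$ | $\mu - \beta \log(\log 2)$ | $\frac{\beta}{\log 2}$ |
| Weibull $\mathcal{W}(k, \lambda)$, shape $k \ge 1$, scale $\lambda > 0$ | $\log\left( \frac{\lambda}{k} \right) + (1 - k) \log\left( \frac{x}{\lambda} \right) + \left( \frac{x}{\lambda} \right)^k$ | $\lambda (\log 2)^{1/k}$ | $\frac{\lambda (\log 2)^{(1 - k)/k}}{k}$ |

Table 1: Standard log-concave probability distributions: $v(\cdot)$ function, median $m$ and Cheeger constant $C_1$ (see Theorem 3.2).

For general log-concave measures, no analytical expressions are available for the Cheeger constant. In this latter case, or in the case of a Boltzmann measure that is not log-concave, we can obtain the Cheeger constant by numerical simulation using the expression $\sup_{x \in \mathbb{R}} \frac{\min(F_j(x), 1 - F_j(x))}{\rho_j(x)}$.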
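The numerical evaluation of the Cheeger constant mentioned above can be sketched with a simple grid search over the supremum expression of Theorem 3.1. This is only one possible approach, not the authors' procedure: the helper name `cheeger_constant`, the search intervals and the grid size are assumptions, and the survival function $1 - F$ is passed separately to limit round-off in the tails. The two test distributions (exponential with rate 4, Gumbel with location 0.2 and scale 0.2) are chosen because Table 1 gives closed forms to compare with.

```python
import math


def cheeger_constant(cdf, sf, pdf, lo, hi, n=200001):
    """Grid approximation of sup_x min(F(x), 1 - F(x)) / rho(x) on [lo, hi]."""
    best = 0.0
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        dens = pdf(x)
        if dens > 0.0:
            best = max(best, min(cdf(x), sf(x)) / dens)
    return best


# Exponential distribution with rate lam (closed form in Table 1: 1 / lam).
lam = 4.0
c1_exp = cheeger_constant(
    cdf=lambda x: -math.expm1(-lam * x),
    sf=lambda x: math.exp(-lam * x),
    pdf=lambda x: lam * math.exp(-lam * x),
    lo=0.0,
    hi=5.0,
)

# Gumbel distribution with location mu and scale beta (closed form: beta / log 2).
mu, beta = 0.2, 0.2


def gumbel_cdf(x):
    return math.exp(-math.exp(-(x - mu) / beta))


def gumbel_sf(x):
    return -math.expm1(-math.exp(-(x - mu) / beta))


def gumbel_pdf(x):
    z = (x - mu) / beta
    return math.exp(-z - math.exp(-z)) / beta


c1_gumbel = cheeger_constant(gumbel_cdf, gumbel_sf, gumbel_pdf, lo=-1.0, hi=3.0)

print(c1_exp, 4.0 * c1_exp**2)            # constant C(mu_j) = 4 * C_1^2
print(c1_gumbel, beta / math.log(2.0))    # grid value next to the closed form
```

A grid search only gives a lower approximation of the supremum, so the interval must contain the median, where the supremum is reached for log-concave measures.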
4. Numerical tests

4.1. Derivative sensitivity indices estimates

A classical estimator of the DGSM indices is the empirical one, given below:

$$\hat{\nu}_j = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{\partial f(X^{(i)})}{\partial x_j} \right)^2. \tag{4.16}$$

Experimental convergence properties of this estimator are given in Sobol and Kucherenko [9].

From equation (2.3), we know that $\frac{\partial f(X^{(i)})}{\partial x_j} = \frac{\partial g(X^{(i)})}{\partial x_j}$. The estimator of $D_j^{tot}$ (see equation (2.7)) and the estimator (4.16) are based on the same function $g(\cdot)$, and it seems that the estimation of these two indices will require approximately the same number of model evaluations in order to converge towards their respective values.

Computations of DGSM and Sobol' indices can be done by using the Monte Carlo algorithm or any variant of it, such as Latin Hypercube Sampling, quasi-Monte Carlo and Markov Chain Monte Carlo sampling. Kucherenko et al. [15] have shown that quasi-Monte Carlo outperforms Monte Carlo when the model has a low effective dimension. The computation of DGSM requires an estimation of the model gradient. For complex models, the model gradient can easily be obtained by the finite-difference method. Patelli and Pradlwarter [3] propose a Monte Carlo estimation of the gradient in high dimension. They used an unbiased estimator for gradients and have shown that a number of Monte Carlo evaluations $n < d$ is sufficient for gradient computations. In the worst case, their procedure requires the same number of model evaluations as the finite-difference method. The method is very efficient when the model has a low effective dimension. When there are many dominant gradient values, an orthogonal linear transformation allows moving to a new space with a few dominant variables.
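The combination of plain Monte Carlo sampling with finite-difference gradients described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `dgsm_finite_difference`, the uniform inputs on [0, 1], the absolute step of 1e-4, the test function and the seed are all assumptions. The cost is one base run plus $d$ perturbed runs per sample point, that is $n(d + 1)$ model evaluations.

```python
import random


def dgsm_finite_difference(f, d, n, rng, step=1e-4):
    """Empirical DGSM nu_j for U[0,1] inputs, with forward finite differences.

    Uses n * (d + 1) model evaluations: one base run and d perturbed runs
    for each of the n sample points.
    """
    sums = [0.0] * d
    for _ in range(n):
        x = [rng.random() for _ in range(d)]
        base = f(x)
        for j in range(d):
            x_pert = list(x)
            x_pert[j] += step
            deriv = (f(x_pert) - base) / step
            sums[j] += deriv * deriv
    return [s / n for s in sums]


def toy_model(x):
    # Hypothetical test function: exact nu_1 = E[(2*x1)^2] = 4/3 and nu_2 = 9.
    return x[0] ** 2 + 3.0 * x[1]


nu = dgsm_finite_difference(toy_model, d=2, n=5000, rng=random.Random(1))
print(nu)
```

Replacing the pseudo-random points by a low-discrepancy sequence would follow the quasi-Monte Carlo variant mentioned in the text, at the same cost per point.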

In the following sections, we compare the estimates of the Sobol' indices ($S_j$ and $S_{T_j}$) and the upper bound of $S_{T_j}$ (see inequality (3.14)). Let $T_j$ denote the total sensitivity upper bound:

$$T_j = \frac{K \nu_j}{D}, \tag{4.17}$$

where $D$ is the variance of the model output $f(X)$ and $K = 4 C_1^2$. The goal of our numerical tests is only to compare the resulting rankings, not to study the speed of convergence of the estimates.
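As a small worked example of equation (4.17), the sketch below recomputes the bound for the third input of the Morris test, using the DGSM value and output variance reported in the numerical results (31653.27 and 991.521) and the Cheeger constant of an exponential distribution with rate 4 ($C_1 = 1/4$). The function name is hypothetical.

```python
def total_sensitivity_upper_bound(nu_j, cheeger_c1, output_variance):
    """T_j = K * nu_j / D with K = 4 * C_1^2, as in equation (4.17)."""
    k = 4.0 * cheeger_c1**2
    return k * nu_j / output_variance


# Input X3 of the Morris test: exponential distribution with rate 4, so C_1 = 1/4.
t_3 = total_sensitivity_upper_bound(
    nu_j=31653.27, cheeger_c1=0.25, output_variance=991.521
)
print(round(t_3, 3))
```

A value above 1 means that the bound carries no information for that input, since a sensitivity index cannot exceed 1.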

4.2. Test on the Morris function

As a first test, we consider the Morris function (Morris [8]) that includes 20 independent and uniform input variables. The Morris function is defined by the following equation:

$$y = \beta_0 + \sum_{i=1}^{20} \beta_i w_i + \sum_{i<j}^{20} \beta_{ij} w_i w_j + \sum_{i<j<l}^{20} \beta_{ijl} w_i w_j w_l + \sum_{i<j<l<s}^{20} \beta_{ijls} w_i w_j w_l w_s, \tag{4.18}$$

where $w_i = 2 \left( x_i - \frac{1}{2} \right)$ except for $i = 3, 5, 7$ where $w_i = 2 \left( 1.1 \frac{x_i}{x_i + 0.1} - \frac{1}{2} \right)$. The coefficient values are:

$\beta_i = 20$ for $i = 1, 2, \ldots, 10$,

$\beta_{ij} = -15$ for $i, j = 1, 2, \ldots, 6$, $i < j$,

$\beta_{ijl} = -10$ for $i, j, l = 1, 2, \ldots, 5$, $i < j < l$,

and $\beta_{1,2,3,4} = 5$.

The remaining first and second order coefficients were generated independently from the normal distribution $\mathcal{N}(0, 1)$ and the remaining third and fourth order coefficients were set to 0.
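The definition above can be written down as a short sketch. It is not the authors' implementation: the random seed used for the remaining coefficients is arbitrary (so the values will differ from the ones used in the paper), and $\beta_0 = 0$ is an assumption, since the text does not give its value.

```python
import itertools
import random

D_INPUTS = 20


def make_coefficients(seed=0):
    """First and second order coefficients; the seed is an arbitrary choice."""
    rng = random.Random(seed)
    beta1 = [20.0 if i < 10 else rng.gauss(0.0, 1.0) for i in range(D_INPUTS)]
    beta2 = {}
    for i, j in itertools.combinations(range(D_INPUTS), 2):
        # i < j, so both inputs are among the first six exactly when j < 6.
        beta2[(i, j)] = -15.0 if j < 6 else rng.gauss(0.0, 1.0)
    return beta1, beta2


def transformed_inputs(x):
    """w_i = 2 (x_i - 1/2), except for inputs 3, 5 and 7 (indices 2, 4, 6)."""
    w = []
    for i, xi in enumerate(x):
        if i in (2, 4, 6):
            w.append(2.0 * (1.1 * xi / (xi + 0.1) - 0.5))
        else:
            w.append(2.0 * (xi - 0.5))
    return w


def morris(x, beta1, beta2, beta0=0.0):
    """Morris function of equation (4.18); beta0 = 0 is an assumption."""
    w = transformed_inputs(x)
    y = beta0 + sum(b * wi for b, wi in zip(beta1, w))
    y += sum(b * w[i] * w[j] for (i, j), b in beta2.items())
    # Third order: -10 for every triple among the first five inputs.
    y += sum(
        -10.0 * w[i] * w[j] * w[l]
        for i, j, l in itertools.combinations(range(5), 3)
    )
    # Fourth order: single nonzero coefficient, on inputs 1, 2, 3 and 4.
    y += 5.0 * w[0] * w[1] * w[2] * w[3]
    return y


beta1, beta2 = make_coefficients()
y_center = morris([0.5] * D_INPUTS, beta1, beta2)
print(y_center)
```

Only the nonzero third and fourth order terms are looped over, which keeps one evaluation cheap enough for the large samples used in the tests.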

We replace the uniform distributions associated with several input variables by different log-concave measures from Table 1 in order to show how the bounds can be used in practical sensitivity analysis. Table 2 shows the probability distributions associated with each input of the Morris function.

| Input | Probability distribution | Input | Probability distribution |
|---|---|---|---|
| X1 | $\mathcal{U}[0, 1]$ | X11 | $\mathcal{U}[0, 1]$ |
| X2 | $\mathcal{N}(0.5, 0.1)$ | X12 | $\mathcal{N}(0.5, 0.1)$ |
| X3 | $\mathcal{E}(4)$ | X13 | $\mathcal{E}(4)$ |
| X4 | $\mathcal{G}(0.2, 0.2)$ | X14 | $\mathcal{G}(0.2, 0.2)$ |
| X5 | $\mathcal{W}(2, 0.5)$ | X15 | $\mathcal{W}(2, 0.5)$ |
| X6 | $\mathcal{U}[0, 1]$ | X16 | $\mathcal{U}[0, 1]$ |
| X7 | $\mathcal{U}[0, 1]$ | X17 | $\mathcal{U}[0, 1]$ |
| X8 | $\mathcal{U}[0, 1]$ | X18 | $\mathcal{U}[0, 1]$ |
| X9 | $\mathcal{U}[0, 1]$ | X19 | $\mathcal{U}[0, 1]$ |
| X10 | $\mathcal{U}[0, 1]$ | X20 | $\mathcal{U}[0, 1]$ |

Table 2: Probability distributions of the input variables of the Morris function

We have performed some simulations that allow computing the DGSM indices and the Sobol' indices for the 20 independent factors. The goal here is not to compare their algorithmic performances in terms of number of simulations, but just to look at the ranking of the inputs. The Sobol' indices $S_j$ and $S_{T_j}$ are obtained with the method of Saltelli [21], using two initial Monte Carlo samples of size 5 x 10^. With 20 input variables, it leads to 1.1 x 10^ model evaluations. The total Sobol' indices are used in this paper as a reference. They show that only the first 10 inputs have some influence. Model derivatives are evaluated for each input on a Monte Carlo sample of size 1 x 10^ by the finite-difference method (perturbation of 0.01%). The DGSM $\nu_j$ then require 2.1 x 10^ model evaluations. $T_j$ is then computed using equation (4.17), where the variance of the Morris function is estimated at $D = 991.521$. The results are available in Table 3.

| Input | $S_j$ | sd | $S_{T_j}$ | sd | $\nu_j$ | $K$ | $T_j$ |
|---|---|---|---|---|---|---|---|
| X1 | 0.046 | 0.008 | 0.172 | 0.004 | 2043.820 | 0.101 | 0.209 |
| X2 | 0.010 | 0.009 | 0.029 | 0.001 | 2856.580 | 0.01 | 0.029 |
| X3 | 0.070 | 0.008 | 0.166 | 0.003 | 31653.270 | 0.250 | 7.981 |
| X4 | 0.006 | 0.010 | 0.134 | 0.002 | 2025.950 | 0.333 | 0.680 |
| X5 | 0.037 | 0.009 | 0.054 | 0.002 | 4203.060 | 0.360 | 1.526 |
| X6 | 0.040 | 0.009 | 0.114 | 0.003 | 1337.100 | 0.101 | 0.137 |
| X7 | 0.070 | 0.008 | 0.068 | 0.002 | 6605.960 | 0.101 | 0.675 |
| X8 | 0.157 | 0.008 | 0.155 | 0.003 | 1826.390 | 0.101 | 0.187 |
| X9 | 0.191 | 0.008 | 0.191 | 0.004 | 2249.770 | 0.101 | 0.230 |
| X10 | 0.149 | 0.008 | 0.147 | 0.004 | 1730.400 | 0.101 | 0.177 |
| X11 | 0.003 | 0.009 | 0.002 | 0.000 | 22.630 | 0.101 | 0.002 |
| X12 | 0.003 | 0.009 | 0.000 | 0.000 | 23.940 | 0.01 | 0.000 |
| X13 | 0.003 | 0.009 | 0.001 | 0.000 | 17.670 | 0.250 | 0.004 |
| X14 | 0.004 | 0.009 | 0.003 | 0.000 | 42.850 | 0.333 | 0.014 |
| X15 | 0.003 | 0.009 | 0.001 | 0.000 | 19.870 | 0.360 | 0.007 |
| X16 | 0.003 | 0.009 | 0.002 | 0.000 | 18.860 | 0.101 | 0.002 |
| X17 | 0.003 | 0.009 | 0.002 | 0.000 | 21.400 | 0.101 | 0.002 |
| X18 | 0.003 | 0.009 | 0.002 | 0.000 | 19.950 | 0.101 | 0.002 |
| X19 | 0.006 | 0.009 | 0.005 | 0.001 | 54.380 | 0.101 | 0.006 |
| X20 | 0.004 | 0.009 | 0.003 | 0.001 | 42.250 | 0.101 | 0.004 |

Table 3: Sensitivity indices (Sobol' and DGSM) for the Morris function. For the Sobol' indices $S_j$ and $S_{T_j}$, 20 replicates have been used to get the standard deviation (sd).

In Table 3, we can first observe that the total sensitivity upper bounds are always greater than the total sensitivity indices, as expected. For each input, we distinguish several situations that can occur:

1. First order and total Sobol' indices are negligible (inputs X11 to X20). In this case, we observe that the bound $T_j$ is always negligible. For all these inputs, this test shows the high efficiency of the bound: a negligible bound guarantees that the input has no influence.

2. First order and total Sobol' indices significantly differ from zero and have approximately the same value (inputs X7 to X10). It means that the input has some influence but does not interact with other inputs. In this case, the bound is clearly relevant (except for X7). The interpretation of the bound gives useful information about the total influence of the input.

3. First order Sobol' index is negligible while total Sobol' index significantly differs from zero (inputs X1 to X6). In this case, the bound $T_j$ largely overestimates the total Sobol' index $S_{T_j}$ for X3, X4 and X5. However, for X4, we have $T_4 < 1$ and this coarse information is still useful. For the three other inputs, the bound is relevant.

For two inputs (X3 and X5), the results can be judged as strongly unsatisfactory, as the bound is useless (larger than 1, which is the maximal value for a sensitivity index). We suspect that these results come from:

• the model non-linearity with respect to these inputs (see equation (4.18)),

• the input distributions (exponential and Weibull).

The second explanation seems to be the more convincing one, as these types of distribution can provide larger values during Monte Carlo simulations. In this case, departures from the central part of the input domain lead to uncontrolled derivative values of the Morris function. Indeed, it can be seen that $\nu_j$ is particularly large for X3 and X5, because of high derivative values in the estimation samples.

As a conclusion of this first test, we argue that the bound $T_j$ is well-suited for a screening purpose. Moreover, coupling the interpretation of $T_j$ with first order Sobol' indices $S_j$ (estimated at low cost using a smoothing technique or a metamodel, see [22]) can bring useful information about the presence or absence of interaction. For inputs following uniform, normal or exponential distributions, the bound seems to be extremely efficient. In these particular cases, the bound is the best one and cannot be improved.

4.3. A case study: a flood model

To illustrate how the Cheeger constant can be used for factor prioritization with the DGSM, we consider a simple application model that simulates the height of a river compared to the height of a dyke. When the height of the river exceeds the height of the dyke, flooding occurs. This academic model is used as a pedagogical example in Iooss [22]. The model is based on a crude simplification of the 1D hydro-dynamical equations of Saint-Venant under the assumptions of uniform and constant flowrate and large rectangular sections. It consists of an equation that involves the characteristics of the river stretch:

$$S = Z_v + H - H_d - C_b \quad \text{with} \quad H = \left( \frac{Q}{B K_s \sqrt{\frac{Z_m - Z_v}{L}}} \right)^{0.6}, \tag{4.19}$$

with $S$ the maximal annual overflow (in meters) and $H$ the maximal annual height of the river (in meters).

The model has 8 input variables, each of which follows a specific probability distribution (see Table 4). Among the input variables of the model, $H_d$ is a design parameter. The randomness of the other variables is due to their spatio-temporal variability, our ignorance of their true value or some inaccuracies in their estimation. We suppose that the input variables are independent.
| Input | Description | Unit | Probability distribution |
|---|---|---|---|
| $Q$ | Maximal annual flowrate | m³/s | Truncated Gumbel $\mathcal{G}(1013, 558)$ on [500, 3000] |
| $K_s$ | Strickler coefficient | - | Truncated normal $\mathcal{N}(30, 8)$ on [15, +∞[ |
| $Z_v$ | River downstream level | m | Triangular $\mathcal{T}(49, 50, 51)$ |
| $Z_m$ | River upstream level | m | Triangular $\mathcal{T}(54, 55, 56)$ |
| $H_d$ | Dyke height | m | Uniform $\mathcal{U}[7, 9]$ |
| $C_b$ | Bank level | m | Triangular $\mathcal{T}(55, 55.5, 56)$ |
| $L$ | Length of the river stretch | m | Triangular $\mathcal{T}(4990, 5000, 5010)$ |
| $B$ | River width | m | Triangular $\mathcal{T}(295, 300, 305)$ |

Table 4: Input variables of the flood model and their probability distributions

We also consider another model output: the associated cost (in million euros) of the dyke presence:

$$C_p = \mathbb{1}_{S > 0} + \left[ 0.2 + 0.8 \left( 1 - \exp\left( -\frac{1000}{S^4} \right) \right) \right] \mathbb{1}_{S \le 0} + \frac{1}{20} \left( H_d \mathbb{1}_{H_d > 8} + 8\, \mathbb{1}_{H_d \le 8} \right), \tag{4.20}$$

with $\mathbb{1}_A(x)$ the indicator function, which is equal to 1 for $x \in A$ and 0 otherwise. In this equation, the first term represents the cost due to a flooding ($S > 0$), which is 1 million euros, the second term corresponds to the cost of the dyke maintenance ($S \le 0$) and the third term is the investment cost related to the construction of the dyke. The latter cost is constant for a dyke height of less than 8 m and grows proportionally with the dyke height otherwise.
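The two outputs of the flood model can be sketched as plain functions. This is an illustration only: the expression used for the river height $H$ and the exact form of the cost are assumptions based on the usual statement of this academic model, and the function names and the nominal evaluation point (modes or centers of the input distributions) are hypothetical choices.

```python
import math


def flood_overflow(q, ks, zv, zm, hd, cb, length, width):
    """Overflow S in meters, assuming the form of equation (4.19)."""
    h = (q / (width * ks * math.sqrt((zm - zv) / length))) ** 0.6
    return zv + h - hd - cb


def dyke_cost(s, hd):
    """Cost C_p in million euros, assuming the form of equation (4.20)."""
    if s > 0.0:
        flood_or_maintenance = 1.0  # cost of a flooding
    else:
        s4 = s**4
        # exp(-1000 / s^4) tends to 0 when s tends to 0.
        decay = math.exp(-1000.0 / s4) if s4 > 0.0 else 0.0
        flood_or_maintenance = 0.2 + 0.8 * (1.0 - decay)
    investment = max(hd, 8.0) / 20.0  # constant below 8 m, proportional above
    return flood_or_maintenance + investment


# Evaluation at a nominal point (assumed: modes or centers of the distributions).
s_nominal = flood_overflow(
    q=1013.0, ks=30.0, zv=50.0, zm=55.0, hd=8.0, cb=55.5, length=5000.0, width=300.0
)
cost_nominal = dyke_cost(s_nominal, hd=8.0)
print(s_nominal, cost_nominal)
```

The cost is discontinuous at $S = 0$ and has a kink at $H_d = 8$, which is worth keeping in mind when derivatives of this output are estimated by finite differences.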

Global sensitivity analysis was performed with 2x10^ model evaluations in order to compute the Sobol' indices (first order indices $S_j$ and total indices $S_{T_j}$). We use Sobol sequences as input samples to simulate the input values. For estimating the DGSM $\nu_j$, the weighted DGSM $\tau_j$ and the total sensitivity upper bound $T_j$, a Sobol sequence is also used, with 1 x 10^ model evaluations.

Results of the global sensitivity analysis and of the derivative-based global sensitivity analysis for the overflow $S$ and the cost $C_p$ outputs are listed in Tables 5 and 6, respectively. The global sensitivity indices show small interactions among input variables for the overflow and the cost outputs. Four input variables ($Q$, $H_d$, $K_s$, $Z_v$) drive the overflow and the cost outputs. This variable classification will serve as a reference for comparison.



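| Input | $S_j$ | $S_{T_j}$ | $\nu_j$ | $\tau_j$ | $T_j$ |
|---|---|---|---|---|---|
| $Q$ | 0.345 | 0.353 | 1.296e-06 | 1.072 | 2.807 |
| $K_s$ | 0.134 | 0.142 | 3.286e-03 | 1.033 | 0.198 |
| $Z_v$ | 0.190 | 0.189 | 1.123e+00 | 1377.41 | 0.561 |
| $Z_m$ | 0.003 | 0.003 | 2.279e-02 | 33.742 | 0.011 |
| $H_d$ | 0.284 | 0.283 | 8.389e-01 | 23.77 | 0.340 |
| $C_b$ | 0.035 | 0.034 | 8.389e-01 | 1268.90 | 0.105 |
| $L$ | 0.000 | 0.000 | 2.147e-08 | 0.268 | 0.000 |
| $B$ | 0.000 | 0.000 | 2.386e-05 | 1.070 | 0.000 |

Table 5: Sensitivity indices for the overflow output of the flood model.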



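| Input | $S_j$ | $S_{T_j}$ | $\nu_j$ | $\tau_j$ | $T_j$ |
|---|---|---|---|---|---|
| $Q$ | 0.361 | 0.478 | 1.3906e-06 | 2.013 | 3.011e+00 |
| $K_s$ | 0.160 | 0.249 | 8.5307e-03 | 1.926 | 5.129e-01 |
| $Z_v$ | 0.172 | 0.219 | 1.3891e+00 | 1715.89 | 6.932e-01 |
| $Z_m$ | 0.007 | 0.004 | 4.6038e-02 | 68.17 | 2.29e-02 |
| $H_d$ | 0.121 | 0.172 | 1.5366e+00 | 44.04 | 6.227e-01 |
| $C_b$ | 0.033 | 0.036 | 9.4628e-01 | 1428.69 | 1.180e-01 |
| $L$ | 0.003 | -0.003 | 4.0276e-08 | 0.503 | 2.009e-06 |
| $B$ | 0.003 | -0.003 | 4.4788e-05 | 2.007 | 5.587e-04 |

Table 6: Sensitivity indices for the cost output of the flood model.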



Based on the derivative sensitivity indices ($\nu_j$) or the weighted derivative sensitivity indices ($\tau_j$), we obtain another subset of most influential variables: $Z_v$, $C_b$, $H_d$, $Z_m$. These results would mean that, for example, the maximum annual flowrate ($Q$) does not have any impact on the overflow and the cost outputs. Compared with the global sensitivity indices, these results are obviously wrong. This is easily explained by the fact that the input variables have different units and that the indices $\nu_j$ and $\tau_j$ do not take into account the probability distribution of $X_j$. However, the difference between the total sensitivity $S_{T_j}$ and the weighted derivative sensitivity $\tau_j$ seems to suggest a non-linear relationship between the model output and its input variables.

By looking at the total sensitivity upper bound $T_j$, the most influential variables are the following: $Q$, $Z_v$, $H_d$, $K_s$, for both the overflow output and the cost output. It gives the same subset of most influential variables, with some slight differences in their prioritization. In conclusion, we confirm that in practice, if the Sobol' indices cannot be estimated because the model is too expensive in CPU time, $T_j$ can provide correct information on the variance-based sensitivities of the inputs.

5. Conclusion 

Global sensitivity analysis, which allows exploring complex numerical models and prioritizing factors, requires a large number of model evaluations. The derivative-based global sensitivity method needs a much smaller number of model evaluations (gain factor of 10 to 100). The reduction in the number of model evaluations becomes more significant when the model output is controlled by a small number of input variables and when the model does not include much interaction among input variables. This is often the case in practice.

In this paper, we have produced an inequality linking the total Sobol' index and a derivative-based sensitivity measure for a large class of probability distributions (Boltzmann measures). The new sensitivity index $T_j$, which is defined as a constant times the crude derivative-based sensitivity, is an upper bound of the total Sobol' index. It improves factor prioritization by using derivative-based sensitivities instead of variance-based sensitivities.

Two numerical tests have confirmed that the bound $T_j$ is well-suited for a screening purpose. When total Sobol' indices cannot be estimated because the model is too expensive in CPU time, $T_j$ can provide correct information on input sensitivities. Previous studies have shown that estimating DGSM with a small sample of derivatives (with size from tens to hundreds) allows detecting non-influential inputs. In subsequent works, we propose to use DGSM and first order Sobol' indices jointly. With this information, an efficient methodology of global sensitivity analysis can be applied, bringing useful information about the presence or absence of interaction (see Iooss et al.).